Moving beyond de novo clustering in fungal community ecology.

Authors

  • Lauren C Cline
  • Zewei Song
  • Gabriel A Al-Ghalith
  • Dan Knights
  • Peter G Kennedy
Abstract

High-throughput sequencing (HTS) has rapidly become the de facto tool for characterizing microbial community structure in a wide variety of habitats (Caporaso et al., 2011; Peay et al., 2016; Truong et al., 2017). Accompanying the expanding use of HTS to quantify microbial diversity is the need to delineate species, the ecological unit traditionally used to compare the richness and composition of communities across treatments, locations or habitats (Magurran, 2005). Due to the challenges in identifying microbial species using morphology or biology alone, designations are typically made by ‘binning’ DNA sequences that meet a similarity threshold into operational taxonomic units (OTUs; Blaxter et al., 2005). Currently, the most widely employed approach for defining fungal OTUs is done according to similarities among sequences within the dataset (Supporting Information Fig. S1). Commonly referred to as de novo clustering (Bik et al., 2012), this approach requires no input database as a reference, which is advantageous when characterizing communities with little a priori knowledge. Despite this benefit, the ecological insights gleaned from de novo clustering can be limited by the challenge of directly comparing OTU identity across different studies (Öpik et al., 2014), and the coarse phylogenetic resolution of many taxonomic assignments (Halwachs et al., 2017). One alternative to de novo clustering is the closed reference approach, where OTUs are binned according to sequence similarity of those in a reference database. With this approach, both OTU clustering and taxonomic designations occur simultaneously. Although the use of closed reference clustering in fungal ecology has been scarce (Fig. S1), it has become increasingly common in the molecular characterization of arbuscular mycorrhizal (AM) fungal communities as well as in many bacterial ‘microbiome’ studies (Öpik et al., 2014; Kelly et al., 2016).
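As a rough illustration of the de novo binning described above, the sketch below greedily groups reads around centroid sequences at a 97% threshold. It is a toy under stated assumptions: the `identity` function simply compares positions without alignment, and the names `identity`, `de_novo_cluster` and the `denovo` OTU prefix are hypothetical. It does not reproduce how CD-HIT or USEARCH compute similarity.

```python
def identity(a: str, b: str) -> float:
    """Toy identity: share of matching positions over the longer length.

    Assumption: no alignment is performed, unlike real clustering tools.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / longest


def de_novo_cluster(reads: dict, threshold: float = 0.97) -> dict:
    """Greedy centroid clustering using only the sequences in the dataset.

    Each read joins the first centroid it matches at or above the
    threshold; otherwise it founds a new OTU and becomes its centroid.
    """
    centroids = []  # centroid sequence for OTU index i
    otus = {}       # OTU label -> list of read names
    for name, seq in reads.items():
        for idx, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                otus[f"denovo{idx}"].append(name)
                break
        else:
            centroids.append(seq)
            otus[f"denovo{len(centroids) - 1}"] = [name]
    return otus


# Hypothetical 100 bp reads: r2 differs from r1 at 2 positions, r3 at 8.
base = "ACGT" * 25
reads = {
    "r1": base,
    "r2": "TT" + base[2:],
    "r3": "T" * 10 + base[10:],
}
clusters = de_novo_cluster(reads)
```

Because OTU labels here depend on the order and content of the input reads, the same organism can receive different labels in different datasets, which is one way to picture the cross-study comparison problem noted in the text.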
The relatively low taxonomic and phylogenetic diversity of AM fungal communities (Stajich et al., 2009; Redecker et al., 2013; Davison et al., 2015), combined with a curated database (Öpik et al., 2010) and increasingly wide usage of the 18S rRNA gene for molecular characterization (Öpik et al., 2014), may explain why AM fungal community ecologists (relative to other fungal ecologists) have readily embraced closed reference clustering. Notably, the closed reference clustering approach has contributed significant new ecological understanding to patterns of AM community assembly by tracking OTUs (referred to as VT, Öpik et al., 2010) across studies with both contrasting habitat types and a wide variety of spatial scales (Davison et al., 2015; García de León et al., 2016). A second alternative to de novo clustering is an open reference approach, which first clusters sequences to a reference database, followed by de novo clustering of the remaining unmatched sequences. This hybrid approach can combine the advantages of the two aforementioned clustering approaches (Rideout et al., 2014; He et al., 2015), but its interpretation can be problematic if the OTU definitions between closed reference and de novo approaches differ. Although open reference clustering is the least commonly used in fungal community ecology analyses to date (Fig. S1), it has been employed in studies of both arbuscular mycorrhizal and ectomycorrhizal fungal communities (Dumbrell et al., 2010; Jarvis et al., 2015). The increasingly widespread adoption of reference-based clustering in many microbial analyses raises the question: should fungal ecologists re-consider their default use of de novo clustering? In particular, it seems that reference-based clustering may represent an increasingly useful approach to fungal community analyses as databases such as UNITE (Kõljalg et al., 2013) grow in size and a greater diversity of fungal habitats are molecularly characterized.
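The two-stage logic of open reference clustering can be sketched as follows: reads are first assigned to reference sequences (closed reference), and whatever fails to match is then clustered de novo, after which the two OTU tables are merged. This is a minimal sketch under stated assumptions, not the pipeline of any particular tool: `identity` is a toy position-by-position comparison with no alignment, and the function names, the `REF_A` identifier and the `denovo` prefix are all hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Toy identity (assumption): matching positions over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / longest


def closed_reference(reads: dict, reference: dict, threshold: float = 0.97):
    """Assign each read to its best reference hit; return (table, unmatched)."""
    table, unmatched = {}, {}
    for name, seq in reads.items():
        best_id, best_score = None, 0.0
        for ref_id, ref_seq in reference.items():
            score = identity(seq, ref_seq)
            if score > best_score:
                best_id, best_score = ref_id, score
        if best_id is not None and best_score >= threshold:
            table.setdefault(best_id, []).append(name)
        else:
            unmatched[name] = seq  # no reference hit at the threshold
    return table, unmatched


def de_novo(reads: dict, threshold: float = 0.97) -> dict:
    """Greedy centroid clustering of reads against each other."""
    centroids, table = [], {}
    for name, seq in reads.items():
        for idx, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                table[f"denovo{idx}"].append(name)
                break
        else:
            centroids.append(seq)
            table[f"denovo{len(centroids) - 1}"] = [name]
    return table


def open_reference(reads: dict, reference: dict, threshold: float = 0.97) -> dict:
    """Closed reference first, then de novo on the leftovers; merge the tables."""
    table, unmatched = closed_reference(reads, reference, threshold)
    table.update(de_novo(unmatched, threshold))
    return table


# Hypothetical data: one reference sequence and three 100 bp reads.
base = "ACGT" * 25
novel = "T" * 10 + base[10:]          # 8 mismatches against the reference
reference = {"REF_A": base}
reads = {
    "r1": "TT" + base[2:],            # 2 mismatches: matches REF_A
    "r2": novel,                      # no reference match
    "r3": novel[:-1] + "A",           # 1 mismatch against r2
}
otu_table = open_reference(reads, reference)
```

In the merged table, OTUs keyed by a reference identifier carry a name that is stable across studies, while the `denovo` OTUs remain dataset-specific. The sketch applies one identity measure to both stages; the interpretive problem raised in the text arises when the two stages define OTUs differently.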
Recent studies have suggested that reference-based clustering can increase OTU stability and taxonomic accuracy relative to de novo clustering (He et al., 2015; Halwachs et al., 2017), although how this clustering approach influences fungal community analyses across diverse habitats is currently unclear. To assess this gap in knowledge, we compared the relative performance of de novo, closed reference, and open reference clustering approaches on a mock community, as well as samples from four ecologically distinct habitats. These habitats varied in the degree to which fungal composition was captured by the UNITE database, providing an opportunity to investigate the importance of a priori habitat characterization on clustering approach performance. Using dead wood, live wood, live leaf and forest soil samples, we quantified fungal species assignments, OTU richness and community composition from ITS1 amplicon libraries sequenced on the Illumina MiSeq platform. We compared two de novo clustering algorithms (CD-HIT and USEARCH; Li & Godzik, 2006; Edgar, 2010), two closed reference clustering algorithms (BLAST and NINJA-OPS; Altschul et al., 1990; Al-Ghalith et al., 2016), as well as two open reference clustering scenarios (NINJA/USEARCH; BLAST/CD-HIT) applying a 97% sequence similarity cutoff for OTU clustering as well as taxonomy assignments (Table S1). For the open reference clustering, sequences were first clustered by a closed reference algorithm (i.e. NINJA or BLAST); the remaining sequences that failed to cluster were then clustered by a de novo clustering approach (i.e. USEARCH or CD-HIT), and the OTU tables were combined (sensu Rideout et al., 2014). The UNITE database (v.7.0) was used for reference-based clustering as well as for designating




Journal:
  • The New Phytologist

Volume 216, Issue 3

Pages: -

Publication date: 2017